An Efficient Mechanism for Matching Multiple Patterns with Streamed XML Data
Abstract
Filtering XML data streams with efficient pattern matching algorithms is a fundamental capability for many data-centric applications and the main purpose of the Template Matching sPecification Language (TMPL). This paper discusses extensions to the language that enable formulating more powerful query patterns: the declarative type system, improved predicates, template references, and sequence matching operators. An optimised matching runtime based on lazily constructed automata is introduced, together with an explanation of the underlying formalism. An example, case studies, and performance measurements illustrate the usage and usability of TMPL.
Similar resources
Online Dictionary Matching for Streams of XML Documents
We consider the online multiple-pattern matching problem for streams of XML documents, when the patterns are expressed as linear XPath expressions containing child operators (/), descendant operators (//) and wildcards (∗) but no predicates. For each document in the stream, the task is to determine all occurrences in the document of all the patterns. We present a general multiple-pattern-matchi...
Adaptive Approximate Record Matching
Typographical data entry errors and incomplete documents produce imperfect records in real-world databases. These errors generate distinct records that belong to the same entity. The aim of Approximate Record Matching is to find multiple records that belong to one entity. In this paper, an algorithm for Approximate Record Matching is proposed that can be adapted automatically with input error...
Unordered XML Pattern Matching with Tree Signatures
We propose an efficient approach for finding relevant XML data twigs defined by unordered query tree specifications. We use the tree signatures as the index structure and find qualifying patterns through integration of structurally consistent query path qualifications. An efficient technique is proposed and its implementation tested on real-life data collections.
Efficient Evaluation of Multiple Queries on Streamed XML Fragments
With the prevalence of Web applications, expediting multiple queries over streaming XML has become a core challenge because of one-pass processing and limited resources. The recently proposed Hole-Filler model consumes few resources for XML fragment transmission and evaluation; however, existing work addressed the multiple-query problem over XML tuple streams instead of XML fragment streams. By taking advant...
RUN, Xtatic, RUN: EFFICIENT IMPLEMENTATION OF AN OBJECT-ORIENTED LANGUAGE WITH REGULAR PATTERN MATCHING
Michael Y. Levin, Benjamin C. Pierce. Schema languages such as DTD, XML Schema, and Relax NG have been steadily growing in importance in the XML community. A schema language provides a mechanism for defining the type of XML documents; i.e., the set of constraints that specify the structure of X...
Publication date: 2006